Avoiding Substrings in Compositions
Abstract
A classical result by Guibas and Odlyzko obtained in 1981 gives the generating function for the number of strings that avoid a given set of substrings with the property that no substring is contained in any of the others. In this paper, we give an analogue of this result for the enumeration of compositions that avoid a given set of prohibited substrings, subject to the compositions’ length and weight.
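To make the counted objects concrete, here is a minimal brute-force sketch, not the generating-function method of the paper. It assumes that a composition of n is a tuple of positive integers summing to n (its weight), and that a prohibited substring is a tuple that must not appear as a run of consecutive parts. The function name `count_avoiding` and its parameters are hypothetical, chosen only for illustration.

```python
def count_avoiding(n, forbidden):
    """Count compositions of n in which no forbidden tuple occurs
    as a run of consecutive parts (brute force, exponential time)."""
    # Assumption: every forbidden pattern is a non-empty sequence of positive integers.
    patterns = [tuple(f) for f in forbidden]

    def ends_with_forbidden(parts):
        # A new occurrence can only be created at the end of the sequence,
        # so checking suffixes after each appended part is sufficient.
        return any(
            len(p) <= len(parts) and parts[len(parts) - len(p):] == p
            for p in patterns
        )

    def extend(remaining, parts):
        if remaining == 0:
            return 1  # the parts sum to n and contain no forbidden run
        total = 0
        for part in range(1, remaining + 1):
            candidate = parts + (part,)
            if not ends_with_forbidden(candidate):
                total += extend(remaining - part, candidate)
        return total

    return extend(n, ())


if __name__ == "__main__":
    # With no prohibited substrings, all 2^(n-1) compositions are counted.
    print(count_avoiding(4, []))
    # Compositions of 4 with no two consecutive parts equal to 1.
    print(count_avoiding(4, [(1, 1)]))
```

Such an enumerator is only practical for small n, but it can serve as a check on coefficients extracted from a closed-form generating function.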
Similar resources
Forbidden substrings on weighted alphabets
In an influential 1981 paper, Guibas and Odlyzko constructed a generating function for the number of length n strings over a finite alphabet that avoid all members of a given set of forbidden substrings. Here we extend this result to the case in which the strings are weighted. This investigation was inspired by the problem of counting compositions of an integer n that avoid all compositions of ...
Mining Peculiar Compositions of Frequent Substrings from Sparse Text Data Using Background Texts
Clustering Documents with Maximal Substrings
This paper provides experimental results showing that we can use maximal substrings as elementary building blocks of documents in place of the words extracted by a current state-of-the-art supervised word extraction. Maximal substrings are defined as the substrings each giving a smaller number of occurrences even by appending only one character to its head or tail. The main feature of maximal s...
Concave Compositions
Concave compositions are compositions (i.e. ordered partitions) of a number in which the parts decrease up to the middle summand(s) and increase thereafter. Perhaps the most surprising result is for even-length concave compositions, where the generating function turns out to be the quotient of two instances of the pentagonal number theorem with variations of sign. The false theta function disco...
Efficient computation of statistics for words with mismatches
Since the early stages of bioinformatics, substrings have played a crucial role in the search and discovery of significant biological signals. Despite the advent of a large number of different approaches and models to accomplish these tasks, substrings continue to be widely used to determine statistical distributions and compositions of biological sequences at various levels of detail. Here we overview...